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Self-organized structures in networks with spike-timing dependent plasticity (STDP) are likely 
to play a central role for information processing in the brain. In the present study we derive a 
reaction-diffusion-like formalism for plastic feed-forward networks of nonlinear rate neurons with 
a correlation sensitive learning rule inspired by and being qualitatively similar to STDP. After 
obtaining equations that describe the change of the spatial shape of the signal from layer to layer, 
we derive a criterion for the non-linearity necessary to obtain stable dynamics for arbitrary input. 

We classify the possible scenarios of signal evolution and find that close to the transition to the 
unstable regime meta-stable solutions appear. The form of these dissipative solitons is determined 
analytically and the evolution and interaction of several such coexistent objects is investigated. 

PACS numbers: 05.45.Yv, 82.40.Ck, 87.19.lw, 87.10.Ed 


INTRODUCTION 

Activity in neuronal networks influences their coupling 
structure due to spike time-dependent synaptic plasticity 
(STDP) [1], which, in turn, influences the activity. Sev¬ 
eral works analytically investigated structures appearing 
in networks of neurons with plastic synapses under the 
influence of external signals. These studies required spe¬ 
cial network architectures, investigating e.g. a kind of 
possible ’elementary cell’ of the network [2] or consid¬ 
ered all-to-all connected networks [3] or averages over 
activity realizations in space [4] or systems without con¬ 
tinuous spatial dimension [5]. Hebbian [6] and simi¬ 
lar learning rules, which can be considered as a simple 
STDP-like rules, are used in many neural network mod¬ 
els, like e.g. Hopfield and Boltzmann networks [7]. These 
works employ the energy minimization principle known 
in physics and set the values of synaptic weights accord¬ 
ing to the patterns to be stored, without considering the 
time evolution of the weights explicitly. These systems 
are able to store one level associations between patterns 
and recognize and restore externally applied and previ¬ 
ously learned patterns. The memory capacity as well 
as times to recognize an input have been investigated 
[7]. Activity propagation in non-plastic feed-forward sys¬ 
tems was considered in [8]. A set of integro-differential 
equations, describing a spatially extended network, was 
introduced in [9] as the neural held model. A scheme 
of solution for a simplified version neglecting refractori¬ 
ness [10] was later extended for neurons with adaptation 
[11] and different non-linearities [12]. The spatial and 
structural organization of cortex [13, 14] separates dif¬ 
ferent neural inputs either in real space or in an effec¬ 
tive space that represents a continuum of features. For 
example, the orientation selectivity of neurons in visual 
cortex is represented by neurons topologically arranged 


on a one dimensional ring [15]. One suggested mech¬ 
anism [16, 17] to implement short-term memory is by 
localized bump solutions, which are also considered in 
[12]. The latter work shows the relation between the 
formalism of reaction-diffusion-like systems and spatially 
extended non-plastic neural networks. Our work demon¬ 
strates such a relation for neural networks with long term 
plasticity. The formalism thus fills the gap between a se¬ 
ries of studies of plastic networks without spatial dimen¬ 
sion and the formalism describing activity in spatially ex¬ 
tended networks. The presented analysis therefore opens 
the possibility to study the transfer of short-term mem¬ 
ory, encoded by bump solutions, into long-term memory, 
stored by synaptic modifications. Here we analytically 
consider a feed-forward network with space-dependent 
connectivity and linear-non-linear neurons with a sim¬ 
ple STDP-inspired synaptic learning rule, similar to the 
BCM rule [18]. We reduce the discrete problem by dif¬ 
fusion approximation to obtain an equation similar but 
not exactly equivalent to a reaction-diffusion equation 
with one component, also called Kolmogorov-Petrovsky- 
Piskounov equation [19, 20]. We derive the requirements 
on the non-linearity necessary for a regime of stable sig¬ 
nal propagation, exposing that fine-tuning of parameters 
is necessary, explaining earlier results [21]. The bump 
solutions that are stable in the critical and meta-stable 
in the subcritical regime are described analytically. The 
interneural connections inside a bump are strengthened, 
resembling cell assemblies [22] and showing how exter¬ 
nally presented objects (external input) change intrinsic 
system properties (connectivity). 

We further consider the interaction between bump so¬ 
lutions in dependence of their size, the distance between 
them, and the system parameters. In the regime close 
to stable propagation we find that several coexisting ac¬ 
tivity bumps can either unite with each other or remain 
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Figure 1. Two neighboring layers in the considered feed¬ 
forward architecture. Synapses between layers obey the plas¬ 
ticity rule (1). 


disjoint, depending on the initial conditions. Their unifi¬ 
cation can be interpreted as an emergence of connections 
between cell-assemblies, and, in this way, represents a 
system of associations between internal representations 
of corresponding external stimuli. The improved formal¬ 
ism can be generalized to neural networks with several 
neuron types and to some extent mapped onto the time 
evolution of a recurrent network, opening the possibility 
for further investigations. 


RESULTS 

Model definition 

Here we consider feed forward networks consisting of 
M layers of N neurons each. A neuron at position x in 
layer n -I-1 gets as input m the weighted sum of outputs 
r of an odd number K of neurons from the neighbor¬ 
hood in the previous layer n, m 

where the indices i and x describe the positions of the 
postsynaptic neuron in the layer n -I- 1, and of the presy- 
naptic neuron in the layer n, respectively. Here Wxx is the 
synaptic weight from neuron x to neuron x. The neuron’s 
output r = f{u) is its input u mapped by the non-linear 
function /. This network architecture is illustrated in 
Figure 1. The absence of feedback allows the solution 
of the system, finding the activities and synaptic weights 
for each pair of layers separately using the solution of the 
previous pair as an input. 

We assume a Hebbian plasticity rule 

Wix = -a{wix - Wo) + iTxrs; ( 1 ) 

for the presynaptic neuron x and the postsynaptic neuron 
i, which is similar to the BCM rule [18], but without the 
“sliding” threshold in the non-linearity /. As a result, 
the function / keeps its sign. For simulations we used 
f(x) = A/{1 + — A/(l + e^®). If a stationary 

solution exists it can be found requiring w^x = 0 as 

7 7 

WS:x =Wo + —TxTx = Wq + —f{Ux)f{Ux). (2) 
a a 

In the following we use ^ = uj as the summed input to 
the postsynaptic neuron to better distinguish it from the 
presynaptic neuron’s u. 


In equilibrium, after the learning processes stopped, ^ 
is to be found as a solution of the self-consistent equation 

i+{K-\)j2 

6 = /(?'«) = X] WixTx (3) 

x—x—{K—l) /2 

£+(/f_l)/2 s+(if_i)/2 

= Wo X X 

x—x—{K—l)/2 x—x—{K—l) /2 

with = /(^) and the symbol’denoting the inverse func¬ 
tion. In the approximation neglecting further derivatives 
(and being exact for A’ = 3 and dx denoting the discrete 

lattice derivative) one can replace 'YTa~i^(K ^)/2 

{K+ad1) where a = X]i=o = (AT—l)Ar(Ar-|-l)/24. 

In doing so, we also make the transition from a discrete 
index a: to a continuous variable, also denoted as a;. So, 
processes in the system can be considered as an inter¬ 
play between the diffusion described by the (3^-operator 
and the explicit (given by /) and implicit non-linearities, 
due to the term in (3) resulting from plasticity. The 
equation (3) can be rewritten in this approximation as 

i = wo{K + adl)r + -r^{K + adl)r'^. (4) 

a 

Analysis of global stability and self-reproducing 
solutions 

One can search for possible stable solutions ^(a;) = 
M(a;), satisfying 

f{r)=wo{K + adl)r+-r{K + adl)r'^, (5) 
a 

where r(x) = f{u{x)). There are no terms explicitly 
depending on the variable x, so one can reduce the or¬ 
der of the equation by introducing y(r) = [dxr]{r) the 
derivative of r depending on r. We can hence express 
the second derivatives in (5) as 

dir = dxV = dry dxV = y dry ( 6 ) 

= 2(9a;r)^ -|- 2rdlr = 2y‘^ + 2ry dry = 2z + rdrZ, 

where we used the substitution z{r) = y'^ = (dxx)'^ in 
the last step. Eq. (5) then takes the form of a linear 
differential equation in 2 

f(r) = wo(K r -\- ay dry) -f —ir^K + ar (2z + rdrz))(7) 

a 

= wqK r + —r^K + 2—ar z + (-wqu + —ar"^) drZ, 
a a 2 a 

replacing ydry in the first line with ^drZ in the second 
line. The solution of the corresponding homogeneous 
equation 2^arz + {^Woa -I- ^ar‘^)drZ = 0 is H{r) = 

f iruoXzar'^ ^ ^^(iwoa -f = 
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c a{wo+ 2 ^r‘^) ail arbitrary constant c), and as solu¬ 

tion of (7) one gets 

+ 2ir») / + 

=q{r') 

( 8 ) 

where we used that the factor in front of drZ cancels 
with H~^. As z is the square of a real number and hence 
cannot be negative, a solution existing for all r (starting 
from r = 0) has to satisfy Q{r) = q{r') dr' > 0. We 

are seeking for solutions that start at r = 0. We call 
J'max the largest value of r reached, for which dxr = 0 
and therefore z(rniax) = 0 from which follows 


Q(w) = 0. (9) 

For this case Q{r), q{r) and the stable solution r{x) are 
shown in Figure 2. The physical meaning of q{r) is the 
competition between the non-linearity / and the effec¬ 
tive non-linearity K{wq -\- ^r‘^)r that describe the prop¬ 
agation of r neglecting diffusion: K{wo + ^r'^)r is the 
value of a postsynaptic neuron’s membrane potential ^ 
for the case that presynaptic and postsynaptic neurons 
have u = f{r). So, a positive q{r) indicates a decrease 
of u and r from one layer to the next, and negative q{r) 
corresponds to both measures increasing. So, in a stable 
system we must have Q{r) > 0 for all r > 0. By defini¬ 
tion, Q{r) = 0 for r = 0. If there are no other r at which 
Q{r) = 0, there exists no self-consistent solution propa¬ 
gating from layer to layer without change. If one tries 
to construct the stable solution according to (8) for this 
case, one will always have positive d^r for every r > 0, 
which means that for every r > 0 ’external support’ is 
needed to compensate the diffusion effects, and without 
it every finite signal will decay after some layers. For the 
case of the presence of an r with Q{r) = 0, a solution 
exists having this r as the maximum. If q{r) = drQ{r) at 
that point is negative, an arbitrary small positive pertur¬ 
bation added to this maximum r will grow. So, Q{r) < 0 
for some r means the absence of a stable solution and ex¬ 
plosion of activity for sufficiently strong activation pat¬ 
terns. 

The solution is given in the form z{r) = where 

the choice of a positive or negative sign in taking the 
square root corresponds to a solution that is first in¬ 
creasing or decreasing, respectively, when moving from 
left to right. The length of a “plateau” with dxr = 0 
and Q(rinax) = 0 is an arbitrary parameter of the solu¬ 
tion which can be chosen freely. If two such solutions 
coexists, they lead to the growth of r in the region be¬ 
tween borders of their plateaus and to their junction to 
one without the external borders changing. 

The form of the rest of the bump - the left and right 
wings r = p{x) - is independent of the plateau’s width 
and can be obtained from (8) analytically or by nu¬ 
merical integration as the inverse function of p(r) = 
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Figure 2. A Explicit /(r) (black) and implicit K{wo -I- ^r^)r 
non-linearity (gray). B Their difference, q{r). C Q(r) = 
f q{r)dr, a minimum of Q of 0 enables a stable non-trivial 
solution r{x) shown in D - theoretical prediction of the stable 
solution with dashed line and simulation results with solid 
line. Simulation parameters: N = 800, M = 400, A = 41, 
wo = 0.99/A, 7 = 0.99/A, 9 = 0.6, /3 = 3.6, A = 1.0754. 


f±l/y/zdr = f ±1/ Q{r) dr integrating 

from an arbitrary value of r between 0 and r^nax and 
taking plus for the left or minus for the right wing. 


Long-living solutions in sub-critical regime 

If no rmax > 0 with Q{r^ax) = 0 exists, but (5(7max) 
has a local minimum close to zero, one gets similar struc¬ 
tures as the “immortal” solutions for (9). The bump so¬ 
lutions propagate over large but finite numbers of layers 
before they disappear. This number is approximately 
proportional to the initial width S of the plateau: for S 
larger than A, the successive reduction of this width does 
not affect the processes on the borders, and the velocity s, 
i.e.the rate of reduction of the plateau’s width from layer 
to layer, is approximately constant, depending only on 
network parameters as shown in Figure 3. The velocity 
s can be calculated as the propagation speed that makes 
the solution stable. Obtaining the rate profile for the 
previous layer as a slightly shifted version r[x) — sdxr{x) 
of the profile r{x) in the current layer one obtains from 

(4) 


fir) = wo{K + adl){r - sdxr) + —r{K + adl){r-sdxr)'^ 

a 

or, after sorting, neglecting terms O(s^), using the defi¬ 
nition of q{r) in (8) and multiplying with dxr 


q{r)dxr - swoK{dxr)'^ - sw^adlr dxr + awod'^r dxr = 

—2s—Kr^{dxr)^ + —ardxrd^{r^) — 2s—ard^{rdxr)dxr 
a a a 

which, after integrating over x from minus infinity to the 
beginning of the plateau, leads to 













4 


A 

o 



B 


X 



C 


layer 



Figure 3. A Q{r) near its minimum close to but bigger than 0. 
B Rate profiles r{x) in different layers showing the reduction 
of the plateau’s width for a long-living rate profile leading 
to linear decrease with the layer’s number of f r{x) dx. C 
The theoretical expectation (11) and the simulated result for 
J r(x) dx. Parameters as for Figure 2 apart from A = 1.0745. 


/•’"max ^ 

(3(^max) = -s woKd^r -I- awodlr -|- — {2Kr^dxr 

Jo a 

-I- 2ard^{rdxr) + ard^(r^)) + awod^r dr. (10) 

The last two terms in the latter expression van¬ 
ish, because d^r dr = ydry dr = 

= 0. Further we have r9^(r^) dr = 

\ dl{r^)_ dr^ = ^/g ^29^2 7/2 dr^ 

= 0 ’^ith y 2 = [dx{r^)]{r^) simi¬ 
lar to y{r) = [i9a;r](r), because f^ax is the plateau height 
and derivatives of r-dependent functions are 0. One can 
therefore obtain s with 


pVmax 

s = -Q(rmax)/( / {woKdxr + 2-Kr'^dxr + 

Jo a 

2—ard‘l(rdxr) aw^d^r) dr). (11) 

a 

One can replace dxr with the dependence dxr = \/z{r) 
given by (8) obtained for Q{fnia^) = 0 - this approxima¬ 
tion is meaningful for integrative quantities that are not 
influenced strongly by the local variation of Q near fmax- 
The chain rule allows the replacement of dx with dxr dr, 
in this way one can express s in terms of the integral of 
the function containing z{r) over r. 

To understand how the system represents two or more 
coexistent signals, we investigate the situation with two 
coexistent meta-stable solutions. Two such bumps can 
unite into one if the minimal amplitude of r between their 
borders becomes big enough for self-generated growth be¬ 
fore one of the plateaus disappears. 

Without loss of generality we take a; = 0 for the mid¬ 
point between the bumps. If the distance between the 
two closer ends of the bumps’ plateaus is larger than some 
critical value, the minimal r = rmin at the minimum posi¬ 
tion a; = 0 decays. One can roughly estimate the change 
drmin of Tniin from One layer to the next, approximating 
r(x) near x = 0 with the direct sum of the two bumps’ 
wings, i.e. r = p(x + p(rmin/2)) + p(-x + p(rniin/2)) 
with p(x) denoting a wing of a bump with p(0) = r^ax 
at the plateau’s edge. The factor 1/2 appears because 



Figure 4. A Theoretical prediction of the change of minimal 
activity rmin between bumps after one layer. B Propagation 
of two bumps uniting to one, in C they remain disjoint, be¬ 
cause their initial distance exceeds the maximal distance for 
unification. D Rate profiles in three layers. The activity of the 
bumps is shifted upwards in proportion to the layer number 
for clearer distinction. The evolution over layers illustrates 
the construction of an ’association tree’: first the left and the 
middle bumps are joint, and later the right one. Parameters 
as for Figure 3 apart from j = 0, wo — X.^jK, j3 = 3.63 for 
panels A-C. 


the direct sum of two identical bump solutions approx¬ 
imately produce the value rmin- In this approxima¬ 
tion (5rniin(rniin) = r^^{rmin) “ J’min, where r"^^ = 
f{[wo{K -b adl)r -b r{K + adl)r‘^]\x=o). drmi-a{rmi-a) is 
shown in Figure 4. The critical distance is given by 
P(^min/2)i ^^min(^min) = 0" larger distances no di¬ 
rect unification of two bumps is possible independent of 
their plateau widths. 

The “next generation” of bumps resulting from this 
merging can interact in the same way with each other and 
the bumps of previous generations that are still alive. In 
this way an association tree reflecting the input structure 
can be created as illustrated in Figure 4. 


DISCUSSION 

We obtain equations describing the propagation of ac¬ 
tivity from layer to layer in feed-forward networks of non¬ 
linear rate neurons with an STDP-inspired plasticity rule. 
We find that the stability of the considered network is 
determined by the sign of the minimum of the function 
Q{r)\ for positive sign, every perturbation decays, for 
negative sign activity explodes for any sufficiently strong 
perturbation. If the minimum is 0, stable attractive so¬ 
lutions exist. They have an analytically obtained form 
containing a plateau of variable width. So, precise tun¬ 
ing of parameters is required to get stable propagation of 
activity patterns unchanged from layer to layer, in agree¬ 
ment with earlier simulation results of [21]. A sigmoidal 
form of the nonlinearity / can help to get an alteration 
of the signs of g = drQ{r), necessary to ensure the ex¬ 
istence of a minimum of Q{r) = 0 at a non-vanishing 
activity level r > 0. A sigmoidal / is therefore a nat- 
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ural choice to bring the system close to criticality. The 
qualitative form of the gain function found here is in line 
with the sigmoidal form derived in [23] for feed-forward 
classification networks. The argumentation and the net¬ 
work model of this earlier work are, however, quite dif¬ 
ferent. Their model does not include spatial organiza¬ 
tion. Rather, the authors find the sigmoid resulting from 
the requirement of optimal stability in the recognition 
of presented and previously learned patterns, without in¬ 
vestigating the learning process itself. For the latter they 
employ the known error backpropagation mechanism, fix¬ 
ing the synaptic weights prior to the consideration of the 
neuronal activation dynamics. Employing field theoretic 
arguments, the sigmoid is found as a soliton (kink) solu¬ 
tion, where the input strength to the nonlinearity plays 
the role of space. The dissipative soliton solutions in our 
work, in contrast, are solutions in real space. Similar 
long-living bump solutions exist in systems with positive 
minima of Q{r) close to 0. These solutions decay with a 
velocity increasing with the value of this minimum. Two 
bumps can unite if the interplay of diffusion and nonlin¬ 
earity between them overcomes the decay when propa¬ 
gating from one layer to the next; otherwise they remain 
disjoint. For a united bump the same scenarios exist, so 
a kind of association tree can appear in this way. Quali¬ 
tatively, all interesting results presented in this article do 
not require the existence of the correlation sensitive term 
(7 ^ 0 ) and exist also in the system with static synapses, 
for which the correspondence to differential equations is 
known [12]. For 7 = 0 (5) is rewritten as q{r) = awod^r, 
describing the stable solution of a reaction-diffusion equa¬ 
tion (RDE) for one chemical with the reaction nonlinear¬ 
ity q and diffusion coefficient awQ. One can also interpret 
the layers in the network as different states of the system 
evolution in time on a discretized time grid. The corre¬ 
sponding time-continuous equation dtu = awod^r — q{r) 
with redefined q{r) = f{r')/Tm. — Kwor for the presence 
of a linear leak —u/rm is analog to a RDE for the time 
evolution of a system with one chemical. Dissipative soli- 
tons are known to be solutions of such systems [ 20 ] and 
correspond to the bump solutions ( 8 ). The interesting 
and non-trivial result is, that such solutions exist also 
for arbitrary learning rates 7^0 and that their shape 


can be obtained analytically. 

A more general equation g(r) = 'Yl,iDi{r)d‘lFi{r) of 
the same type as (5), but with an arbitrary number 
of functions under the second derivative is solved with 
the same substitutions z, y: q{r) = '^iDi{r)dlFi{r) = 
E.A(r)((a»9,F,(r) + {d^rfdlF^ir)) 

\{Y.iDi(r)drFi{r))drZ + iJ2i D^{r)^^F,{r))z, 

z{r) = 2H(r) f n,(r')dlF,(r') 

ff(r) = exp(-2/'' dr')- Further sim¬ 

plifications done for (8) were applicable only to this 
particular problem and are impossible in the general 
case. This generalized equation can be used to describe 
e.g. stable solutions for systems with several neuron 
and synapse types, in particular networks including 
inhibitory neurons. There the interaction between 
several bumps can exhibit more complex behavior than 
in the case of non-plastic synapses [12]. This equation 
is similar but not equivalent to an equation describing 
a one-component reaction-diffusion system (for only 
one i this mapping is exact). Still, a similar analysis 
of the existence of stable and meta-stable solutions, as 
presented for (5), is possible for this more general case. 
Preliminary results indicate that associative learning 
and memory can persist even if the activity is restored 
to the baseline level, similar as in [22]. In contrast to 
classical models of associative memory, such as fully 
connected Hopfield networks and Boltzmann machines 
[7], our formalism allows the study of spatially extended 
representations of earlier presented, learned objects. 

With the leak term —ujTm, the evolution of ac¬ 
tivity is described by dtUt{x) = —T^^Ut{x) + 

for stationary 

solutions u{x) = + 

^f{u{x))f{u{x))]f{u{x)), which, after diffusion ap¬ 
proximation of the sum, is equivalent to (5) with 
parameters ry, twq and K — 1 instead of 7 , wq and 
K. So the presented results generalize to stationary 
solutions in recurrent networks. 
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